(54) Monomeric protein of the TGF-beta family

(57) The present invention is concerned with proteins selected from the members of the TGF-β superfamily
which are monomeric due to substitution or deletion of a cysteine which is responsible for dimer formation.

The invention is also concerned with nucleic acids encoding such monomeric proteins, vectors or host cells
containing the nucleic acids, as well as with pharmaceutical compositions comprising the proteins or nucleic
acids encoding the proteins. The pharmaceutical compositions can be applied advantageously for all indications
for which the respective dimeric proteins are useful.
Description

[0001] The present invention concerns a biologically active protein from the TGF-β superfamily, wherein this protein
remains in monomeric form due to substitution or deletion of a cysteine which is responsible for the dimerization in the
wild-type protein. Further, the invention concerns a nucleic acid which codes for a protein according to the invention,
an expression vector containing such a nucleic acid, and a host cell containing a corresponding nucleic acid or an
expression vector, said nucleic acid being suitable for the expression of the protein. The invention also concerns a
pharmaceutical composition containing the protein according to the invention or a nucleic acid coding therefor. The use of
the pharmaceutical composition according to the invention concerns the prevention or treatment of all conditions which
can also be treated with the dimeric form of the corresponding protein.

[0002] Many growth factors from the TGF-β superfamily (Kingsley, Genes and Development 8, 133-146 (1994) as
well as the references cited therein) are relevant for a wide range of medical treatment methods and applications which
in particular concern promotion of cell proliferation and tissue formation, including wound healing and tissue
reproduction. Such growth factors in particular comprise members of the TGF-β (transforming growth factor, cf. e.g. Roberts
and Sporn, Handbook of Experimental Pharmacology 95 (1990), page 419-472, editors: Sporn and Roberts), the DVR
group (Hotten et al., Biochem. Biophys. Res. Comm. 206 (1995), page 608-613 and further literature cited therein)
including BMPs (bone morphogenetic protein, cf. e.g. Rosen and Thies, Growth Factors in Perinatal Development
(1993), page 39-58, editors: Tsang, Lemons and Balistreri) and GDFs (growth differentiation factors), the inhibin/activin
(cf. e.g. Vale et al., The Physiology of Reproduction, second edition (1994), page 1861-1878, editors: Knobil and Neill)
and the GDNF protein family (Rosenthal, Neuron 22 (1999), page 201-203; Airaksinen et al., Mol Cell Neurosci 13
(1999), page 313-325). Although the members of the TGF-β superfamily show high amino acid homologies in the
mature part of the protein, in particular 7 conserved cysteines, they show considerable variations in their exact functions.
Often individual growth factors of these families exhibit a plurality of functions at the same time, so that their application
is of interest in various medical indications. Some of these multifunctional proteins also have survival promoting effects
on neurons in addition to functions such as regulation of proliferation and differentiation in many cell types (Roberts
and Sporn, supra; Sakurai et al., J. Biol. Chem. 269 (1994), page 14118-14122). Thus, for example, trophic effects on
embryonic motor and sensory neurons were demonstrated for TGF-β in vitro (Martinou et al., Devl. Brain Res. 52, page
175-181 (1990) and Chalazonitis et al., Dev. Biol. 152, page 121-132 (1992)). In addition, effects promoting survival are
shown for dopaminergic neurons of the mid-brain for the proteins TGF-β-1, -2, -3, activin A and GDNF (glial cell line-derived
neurotrophic factor), a protein which has structural similarities to TGF-β superfamily members, these effects not being
mediated via astrocytes (Krieglstein et al., EMBO J. 14, page 736-742 (1995)).

[0003] Interesting members of the TGF-β superfamily or active variants thereof comprise the TGF-β proteins like
TGF-β1, TGF-β2, TGF-β3, TGF-β4, TGF-β5 (U.S. 5,284,763; EP 0376785; U.S. 4,886,747; DNA 7 (1988), page 1-8;
EMBO J. 7 (1988), page 3737-3743; Mol. Endo. 2 (1988), page 1186-1195; J. Biol. Chem. 265 (1990), page
1089-1093), OP1, OP2 and OP3 proteins (U.S. 5,011,691, U.S. 5,652,337, WO 91/05802) as well as BMP2, BMP3,
BMP4 (WO 88/00205, U.S. 5,013,649 and WO 89/10409, Science 242 (1988), page 1528-1534), BMP5, BMP6 and
BMP-7 (OP1) (Proc. Natl. Acad. Sci. 87 (1990), page 9841-9847, WO 90/11366), BMP8 (OP2) (WO 91/18098), BMP9
(WO 93/00432), BMP10 (WO 94/26893), BMP11 (WO 94/26892), BMP12 (WO 95/16035), BMP13 (WO 95/16035),
BMP15 (WO 96/36710), BMP16 (WO 98/12322), BMP3b (Biochem. Biophys. Res. Comm. 219 (1996), page 656-662),
GDF1 (WO 92/00382 and Proc. Natl. Acad. Sci. 88 (1991), page 4250-4254), GDF8 (WO 94/21681), GDF10
(WO 95/10539), GDF11 (WO 96/01845), GDF5 (CDMP1, MP52) (WO 95/04819; WO 96/01316; WO 94/15949, WO
96/14335 and WO 93/16099 and Nature 368 (1994), page 639-643), GDF6 (CDMP2, BMP13) (WO 95/01801, WO
96/14335 and WO 95/16035), GDF7 (CDMP3, BMP12) (WO 95/01802 and WO 95/10635), GDF14 (WO 97/36926),
GDF15 (WO 99/06445), GDF16 (WO 99/06556), 60A (Proc. Natl. Acad. Sci. 88 (1991), page 9214-9218), DPP (Nature
325 (1987), page 81-84), Vgr-1 (Proc. Natl. Acad. Sci. 86 (1989), page 4554-4558), Vg-1 (Cell 51 (1987), page
861-867), dorsalin (Cell 73 (1993), page 687-702), MIS (Cell 45 (1986), page 685-698), pCL13 (WO 97/00958), BIP
(WO 94/01557), inhibin α, activin βA and activin βB (EP 0222491), activin βC (MP121) (WO 96/01316), activin βE and
GDF12 (WO 96/02559 and WO 98/22492), activin βD (Biochem. Biophys. Res. Comm. 210 (1995), page 581-588),
GDNF (Science 260 (1993), page 1130-1132, WO 93/06116), Neurturin (Nature 384 (1996), page 467-470), Persephin
(Neuron 20 (1998), page 245-253, WO 97/33911), Artemin (Neuron 21 (1998), page 1291-1302), Mic-1 (Proc. Natl.
Acad. Sci. USA 94 (1997), page 11514-11519), Univin (Dev. Biol. 166 (1994), page 149-158), ADMP (Development
121 (1995), page 4293-4301), Nodal (Nature 361 (1993), page 543-547), Screw (Genes Dev. 8 (1994), page
2588-2601). Other useful proteins include biologically active biosynthetic constructs including biosynthetic proteins
designed using sequences from two or more known morphogenetic proteins. Examples of biosynthetic constructs are
disclosed in U.S. 5,011,691 (e.g. COP-1, COP-3, COP-4, COP-5, COP-7 and COP-16). The disclosure of the cited
publications, including patents or patent applications, is incorporated herein by reference.

[0004] The occurrence of proteins of the TGF-β superfamily in various tissue stages and development stages
corresponds with differences with regard to their exact functions as well as target sites, life span, requirements for
auxiliary factors, necessary cellular physiological environment and/or resistance to degradation.

[0005] The proteins of the TGF-β superfamily exist as homodimers or heterodimers having a single disulfide bond.
This disulfide bond is mediated by a specific cysteine residue of the respective monomers, which is conserved in most
of the proteins. Up to now it was considered indispensable for the biological activity that the protein is present in its
dimeric form. Several publications indicated that biological activity can only be obtained for dimeric proteins, and it was
speculated that this dimer formation is important for further polymer formation of two or more dimers to achieve
intercellular signal transmission by simultaneous binding to type I and type II receptors for the TGF-β superfamily proteins
on cells. It was assumed that only this simultaneous binding to both kinds of receptors would allow for effective
intercellular signal transmission for the benefit of the patient (Bone, volume 19 (1996), page 569-574).

[0006] A disadvantage of the use of these proteins as medicaments and of their production is that they are not readily
obtainable in biologically active and sufficiently pure form by recombinant expression in prokaryotes without intensive
renaturation procedures.

[0007] Thus it was the object of the present invention to provide a simple and inexpensive possibility to reproducibly
produce proteins exhibiting high biological activity, wherein this biological activity should essentially correspond to that
of the dimers of the proteins of said families.

[0008] This object is solved according to the invention by a protein selected from the members of the TGF-β protein
superfamily, such protein being necessarily monomeric due to substitution or deletion of a cysteine which is responsible
for dimer formation.

[0009] Surprisingly it has been found that the substitution or deletion of the cysteine which normally effects the
dimerization in the proteins results, upon expression and correct folding (proper formation of the intramolecular disulfide
bridges), in a monomeric protein that retains the biological activity of the dimeric form. Even more surprisingly, it was
found that at least some of the monomeric proteins show a higher activity, based on the weight of protein, than their
respective dimeric forms. Apart from this improved biological activity, an important advantage of the proteins according
to the invention is that they can be expressed in large amounts in prokaryotic hosts and, upon simple refolding of the
monomers, are obtained in high purity and very high yield without the need to separate dimerized from non-dimerized
(monomeric) protein. The findings of the present invention are very surprising since, as already mentioned above,
it was common understanding that only a dimer of the morphogenetic proteins has biological activity. Despite this
understanding, the proteins according to the invention show an up to two-fold higher activity than that of the dimer on
the basis of protein weight. The smaller size of the proteins of the invention, while maintaining the biological activity,
can also be considered advantageous, e.g. for applications concerning the brain, since the monomeric protein can
pass the blood-brain barrier much more easily than the dimeric form.

[0010] The proteins according to the invention encompass all proteins of the mentioned protein families that are
normally present in dimeric form. Also parts of such proteins that retain substantial activity, or fusion proteins or precursor
forms of proteins, shall be considered as encompassed by the present invention, as well as biologically active naturally
occurring or biosynthetic variants of TGF-β superfamily proteins, as long as they show at least considerable biological
activity.

[0011] In a preferred embodiment of the present invention the monomeric protein is a mature protein or a biologically
active part or variant thereof. The term "biologically active part or variant thereof" is meant to define either fragments
retaining activity, precursor proteins that are e.g. cleaved at the site of activity to the mature form or show biological
activity themselves, or also variants that still maintain essentially the biological activity of the wild-type protein. Such
variants preferably contain conservative amino acid substitutions, but especially at the N-terminal part of the mature
proteins even considerable deletions or substitutions do not lead to a considerable loss of biological activity. It is well
within the skill of the man skilled in the art to determine whether a certain protein shows the required biological activity.
Proteins showing at least 70% and preferably at least 80% homology to the mature wild-type proteins of the above
referenced protein families should be understood as encompassed by the present invention, as long as they contain the
deletion or substitution of a cysteine, as required for the proteins according to the invention, and therefore do not form
dimers.
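
The 70% and 80% thresholds above can be illustrated with a short calculation. The following is only a minimal sketch: it assumes that "homology" is read as percent identity over a pairwise alignment that has already been computed, which the specification does not state. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Both strings must have the same length; '-' marks an alignment gap.
    Columns where both sequences have a gap are ignored. Treating
    "homology" as identity is an assumption made for this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b and a != "-":
            identical += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * identical / compared


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float = 70.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= minimum


# Toy example (not real protein sequences): 9 of 10 positions agree.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

A sequence passing such a check would still have to lack the dimer-forming cysteine to fall under the definition given above.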
[0012] It is especially preferred that proteins according to the invention contain at least the 7 cysteine region
characteristic for the TGF-β protein superfamily.

[0013] This specific 7 cysteine region is considered to be the most important part of the proteins in view of the
biological activity. Therefore proteins retaining this critical region are preferred proteins according to the invention. It
is disclosed in the state of the art which cysteine is responsible in a certain protein family or protein for dimer formation
(see for example: Schlunegger & Grütter (1992) Nature 358, 430-434; Daopin et al. (1992) Science 257, 369-373 and
Griffith et al., Proc. Natl. Acad. Sci. 93 (1996), page 878-883). This cysteine has to be deleted or substituted by another
amino acid to form a protein according to the invention.
[0014] The 7 cysteine region is known for many proteins of the TGF-β protein superfamily. In this region the location
of the cysteine residues relative to each other is important and is only allowed to vary slightly in order not to lose the
biological activity. Consensus sequences for such proteins are known in the state of the art, and all proteins complying
with such consensus sequences are considered to be encompassed by the present invention.
[0015] In an especially preferred embodiment of the present invention the protein contains a consensus sequence
according to the following sequence

C (Y)25-29 C Y Y Y C (Y)25-35 X C (Y)27-34 C Y C (Formula I),

wherein C denotes cysteine, Y denotes any amino acid including cysteine and X denotes any amino acid except
cysteine.

[0016] More preferably the protein according to the invention contains a consensus sequence according to the
following sequence

C (Y)28 C Y Y Y C (Y)30-32 X C (Y)31 C Y C (Formula II),

wherein C, X and Y have the same meaning as defined above.

[0017] Even more preferably the protein according to the invention contains a consensus sequence according to the
following sequence

C (X)28 C X X X C (X)31-33 C (X)31 C X C (Formula III),

wherein C and X have the same meaning as defined above.
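
The three consensus sequences can be read as spacing patterns over the one-letter amino acid code. The following is a minimal sketch, not part of the disclosed method, that expresses Formulas I to III as regular expressions. It assumes that the numbers following (Y) and (X) are repeat counts or ranges; the function name and the toy sequences are hypothetical and are not real protein sequences.

```python
import re

# Y (any amino acid, cysteine included) is written as "." and
# X (any amino acid except cysteine) as "[^C]".
FORMULAS = {
    "I": re.compile(r"C.{25,29}C.{3}C.{25,35}[^C]C.{27,34}C.C"),
    "II": re.compile(r"C.{28}C.{3}C.{30,32}[^C]C.{31}C.C"),
    "III": re.compile(r"C[^C]{28}C[^C]{3}C[^C]{31,33}C[^C]{31}C[^C]C"),
}


def matching_formulas(sequence: str) -> list[str]:
    """Return the names of the formulas found anywhere in the sequence."""
    seq = sequence.upper()
    return [name for name, pattern in FORMULAS.items() if pattern.search(seq)]


# Toy sequences: alanine as filler, with either serine or a second cysteine
# at the position directly preceding the fifth cysteine of the pattern.
substituted = "C" + "A" * 28 + "C" + "AAA" + "C" + "A" * 31 + "S" + "C" + "A" * 31 + "CAC"
unsubstituted = "C" + "A" * 28 + "C" + "AAA" + "C" + "A" * 31 + "C" + "C" + "A" * 31 + "CAC"

print(matching_formulas(substituted))
print(matching_formulas(unsubstituted))
```

In this sketch the toy sequence carrying serine satisfies all three patterns, while the one retaining the adjacent cysteine satisfies only the pattern for Formula I. Because Y admits cysteine, a Formula I hit by itself does not show that the dimer-forming cysteine is absent; the pattern for Formula III is the strictest of the three.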

[0018] These consensus sequences contain especially preferred distances between the respective cysteine residues,
and in them the dimer forming cysteine is already substituted by another amino acid. As with all proteins
of said protein superfamily, the location of and distance between the cysteines is more important than the identity of
the other amino acids contained in this region. Therefore, the consensus sequence shows the respective location of
the cysteines, but does not show the identity of the other amino acids, since these other amino acids are widely variable
in the proteins of the TGF-β protein superfamily.

[0019] In a preferred embodiment of the present invention the monomeric protein according to the invention is a
morphogenetic protein.

[0020] Most of the members of the TGF-β protein superfamily are morphogenetic proteins that are useful for
treatments where regulation of differentiation and proliferation of cells or progenitor cells is of interest. This can result in
replacement of damaged and/or diseased tissue, for example skeletal (bone, cartilage) tissue, connective tissue,
periodontal or dental tissue, neural tissue, tissue of the sensory system, liver, pancreas, cardiac, blood vessel and
renal tissue, uterine or thyroid tissue etc. Morphogenetic proteins are often useful for the treatment of ulcerative or
inflammatory tissue damage and wound healing of any kind such as enhanced healing of ulcers, burns, injuries or skin
grafts. Especially preferred proteins according to the invention belong to the TGF-β, BMP, GDF, activin or GDNF
families. Several BMP proteins, which were originally discovered by their ability to induce bone formation, have been
described, as also indicated above. Meanwhile, several additional functions have been found, as is also true for members
of the GDFs. These proteins show a very broad field of applications and, in addition to their bone and
cartilage growth promoting activity (see for example: WO 88/00205, WO 90/11366, WO 91/05802), are especially useful
in periodontal disease, for inhibiting periodontal and tooth tissue loss, for sealing tooth cavities, for enhancing integration
of a tooth in a tooth socket (see for example: WO 96/26737, WO 94/06399, WO 95/24210), for connective tissue such as
tendon or ligament (see for example: WO 95/16035), for improving survival of neural cells, for inducing growth of neural
cells and repairing neural defects, for damaged CNS tissue due to stroke or trauma (see for example: WO 97/34626, WO
94/03200, WO 95/05846), for maintaining or restoring sensory perception (see for example WO 98/20890, WO
98/20889), for renal failure (see for example: WO 97/41880, WO 97/41881), for liver regeneration (see for example
WO 94/06449), for regeneration of myocardium (see for example WO 98/27995), for treatment or preservation of
tissues or cells for organ or tissue transplantation, for integrity of gastrointestinal lining (see for example WO 94/06420),
for increasing progenitor cell populations, for example hematopoietic progenitor cells, by ex vivo stimulation (see for
example WO 92/15323), etc. One preferred member of the GDF family is the protein MP52, which is also termed GDF-5
or CDMP-1. Applications for MP52 reflect several of the already described applications for the BMP/GDF family.
MP52 is considered to be a very effective promoter of bone and cartilage formation as well as connective tissue
formation (see for example WO 95/04819, Hotten et al. (1996), Growth Factors 13, 65-74, Storm et al. (1994) Nature
368, 639-643, Chang et al. (1994) J. Biol. Chem. 269 (45), 28227-28234). In this connection MP52 is useful for
applications concerning the joints between skeletal elements (see for example Storm & Kingsley (1996) Development
122, 3969-3979). One example for connective tissue is tendon and ligament (Wolfman et al. (1997), J. Clin. Invest.
100, 321-330, Aspenberg & Forslund (1999), Acta Orthop Scand 70, 51-54, WO 95/16035). MP52 is also useful for
tooth (dental and periodontal) applications (see for example WO 95/04819, WO 93/16099, Morotome et al. (1998),
Biochem Biophys Res Comm 244, 85-90). MP52 is useful in wound repair of any kind. It is in addition very useful for
promoting tissue growth in the neuronal system and survival of dopaminergic neurons, for example. MP52 in this
connection is useful for applications in neurodegenerative diseases like e.g. Parkinson's disease and possibly also
Alzheimer's disease or Huntington chorea tissues (see for example WO 97/03188, Krieglstein et al. (1995) J. Neurosci
Res. 42, 724-732, Sullivan et al. (1997) Neurosci Lett 233, 73-76, Sullivan et al. (1998), Eur. J. Neurosci 10, 3681-3688).
MP52 makes it possible to maintain nervous function or to retain nervous function in already damaged tissues. MP52 is
therefore considered to be a generally applicable neurotrophic factor. It is also useful for diseases of the eye, in particular
of the retina, cornea and optic nerve (see for example WO 97/03188, You et al. (1999), Invest Ophthalmol Vis Sci 40,
296-311). The monomeric MP52 is expected to show all the already described activities of the dimeric form as well as
some further activities described for the dimeric BMP/GDF family members. It is expected to be, for example, also useful
for increasing progenitor cell populations and for stimulating differentiation of progenitor cells ex vivo. Progenitor cells
can be cells which take part in the cartilage formation process or hematopoietic progenitor cells. It is also useful for
damaged or diseased tissue where a stimulation of angiogenesis is advantageous (see for example: Yamashita et al.
(1997), Exp Cell Res 235, 218-226).

[0021] An especially preferred protein according to the invention therefore is protein MP52 or a biologically active
part or variant thereof. In line with the definition of these terms given above, MP52 can e.g. be used in its
mature form; however, it can also be used as a fragment thereof containing at least the 7 cysteine region, or in a
precursor form. Deviations at the N-terminal part of mature MP52 do not affect its activity to a considerable degree.
Therefore, substitutions, deletions or additions at the N-terminal part of the proteins are still within the scope of the
present invention. It might be useful to add a peptide to the N-terminal part of the protein, e.g. for purification reasons.
It might not be necessary to cleave off this added peptide after expression and purification of the protein. Additional
peptides at the N- or C-terminal part of the protein may also serve for the targeting of the protein to special tissues
such as nerve or bone tissue or for the penetration of the blood-brain barrier. Generally, fusion proteins of a
monomeric protein according to the invention and another peptide or group are also considered within the scope of the
present invention, wherein these other peptides or groups direct the localization of the fusion protein, e.g.
because of an affinity to a certain tissue type etc. Examples of such fusion proteins are described in WO 97/23612. The
protein containing such an addition will retain its biological activity at least as long as such addition does not impair the
formation of the biologically active conformation of the protein.

[0022] In an especially preferred embodiment of the present invention the protein comprises the amino acid
sequence according to SEQ.ID.NO.1 (DNA and protein sequence) and SEQ.ID.NO.2 (protein sequence, only),
respectively. SEQ.ID.NO.2 shows the complete protein sequence of the prepro protein of human MP52, as already disclosed
in WO 95/04819. The start of the mature protein lies preferably in the area of amino acids 352-400, especially preferred
at amino acids 381 or 382. Therefore, the mature protein comprises amino acids 381-501 or 382-501. The first alanine
of the mature protein can be deleted and the mature protein then preferably comprises amino acids 383-501. The
cysteine at position 465 that is present in the already described dimeric MP52 protein is according to the invention
either deleted or substituted by another amino acid. This deletion or substitution is represented by Xaa at the respective
position in SEQ.ID.Nos.1 and 2.
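
The substitution or deletion of a single cysteine at a known position can be sketched as a simple sequence operation. The following is only an illustration under stated assumptions: positions are 1-based and refer to the numbering of whatever sequence is passed in, the function name is hypothetical, and the sequence used is a toy string rather than the MP52 sequence of SEQ.ID.NO.2.

```python
from typing import Optional


def remove_dimer_cysteine(sequence: str, position: int,
                          replacement: Optional[str] = None) -> str:
    """Substitute or delete the cysteine at a 1-based position.

    With replacement=None the residue is deleted; otherwise it is
    replaced by the given one-letter amino acid, which must not be 'C'.
    """
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[index].upper() != "C":
        raise ValueError("residue at the given position is not cysteine")
    if replacement is None:
        return sequence[:index] + sequence[index + 1:]
    if len(replacement) != 1 or replacement.upper() == "C":
        raise ValueError("replacement must be one non-cysteine residue")
    return sequence[:index] + replacement.upper() + sequence[index + 1:]


# Toy sequence (hypothetical): cysteine at position 4 replaced by alanine.
toy = "MKACCLV"
print(remove_dimer_cysteine(toy, 4, "A"))  # substitution
print(remove_dimer_cysteine(toy, 4))       # deletion
```

For the human MP52 prepro protein the position argument would correspond to position 465 named above; the check on the residue guards against applying the edit to a differently numbered fragment.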

[0023] The activin/inhibin family proteins are of interest for applications related to contraception, fertility and
pregnancy (see for example WO 94/19455, U.S. 5,102,868). They are also of interest for applications like repair or
prevention of diseases of the nervous system, and they can be used in the repair of organ tissue such as liver and even
of bone and cartilage. In this connection MP121 (activin βC) is especially useful in applications for growth or regeneration
of damaged and/or diseased tissue, especially liver tissue, neural tissue and skeletal tissue (see for example WO
96/01316, WO 98/22492 and WO 97/03188). MP121 is known to be predominantly expressed in the liver, whereby the
mRNA is markedly reduced after partial hepatectomy. MP121 is expected to regulate the liver mass (Zhang et al.,
Endocrine Journal 44 (1997), page 759-764). The monomeric MP121 shows all the already described activities of the
dimeric form as well as some further activities described for the dimeric TGF-β superfamily members. It
is for example also expected to be useful in treatment of ulceration (for example stomach ulceration) and useful for
integrity of gastrointestinal lining and for stimulating differentiation of progenitor cells ex vivo, and for treatment or
preservation of mammalian tissue or cells, e.g. for organ or tissue transplantation.

[0024] A further preferred protein according to the invention therefore is MP121, a member of the activin/inhibin
protein family. Also for this protein a biologically active part or variant thereof is encompassed by the present invention
according to the above defined rules. An especially preferred embodiment is shown in SEQ.ID.NO.3 (DNA and protein
sequence) and SEQ.ID.NO.4 (protein sequence, only), respectively. SEQ.ID.NO.4 shows the complete amino acid
sequence of the prepro protein of human MP121, which has already been disclosed in WO 96/01316. The start of the
mature protein lies preferably between amino acids 217 and 247, most preferred at amino acid 237. A preferred mature
protein therefore comprises the mature part of the protein starting at amino acid 237 and ending at amino acid 352.
However, the precursor protein comprising the whole shown amino acid sequence is also encompassed by the present
invention. The cysteine at position 316 is according to the invention either deleted or substituted by another amino
acid, being represented by Xaa in SEQ.ID.Nos.3 and 4.

[0025] The amino acid by which the cysteine residue effecting the dimerization is substituted can be any
amino acid that does not impair the formation of a biologically active conformation. The amino acid is preferably selected
from the group of alanine, serine, threonine, leucine, isoleucine, glycine and valine.

[0026] The proteins according to the invention are in summary characterized by the absence of the cysteine residue
in the amino acid sequence responsible for the dimer formation. This absence can be effected by substitution of this
cysteine by another amino acid or by deletion. In case of deletion, however, it must be assured for the protein that the
formation of the biologically active conformation is not hindered. The same is true for the selection of the substitution
amino acid, wherein it is preferred to use an amino acid which has a form similar to cysteine.

[0027] The monomeric proteins according to the invention can be easily produced, in particular by expression in
prokaryotes and renaturation according to known methods. It is advantageous that the protein can be obtained in
exceedingly biologically active form. The proteins exhibit in monomeric form about the same activity as the dimer so that
based on the amount of active substance only half of the monomeric protein has to be used in order to obtain the same
positive biological effects.

[0028] A further subject matter of the present invention is a nucleic acid encoding a protein according to the invention.
It is obvious that the nucleic acid has to have such a sequence that a deletion or substitution of the cysteine responsible
for the dimer formation is achieved. The nucleic acid can be a naturally occurring nucleic acid, but also a recombinantly
produced or processed nucleic acid. The nucleic acid can be either a DNA sequence or an RNA sequence, as long
as the protein according to the invention can be obtained from this nucleic acid upon expression in a suitable system.
[0029] In a preferred embodiment of the invention the nucleic acid is a DNA sequence. This DNA sequence in an
especially preferred embodiment of the invention comprises a sequence as shown in SEQ.ID.NO.1 and SEQ.ID.NO.3,
respectively, or parts thereof. SEQ.ID.NO.1 shows a nucleic acid encoding MP52, wherein the codon for the cysteine
responsible for the dimer formation is either replaced by another codon which does not encode cysteine or deleted. This
substitution or deletion is shown as "nnn" in the sequence protocols. SEQ.ID.NO.3 shows a nucleic acid encoding
MP121, wherein the codon for the cysteine effecting the dimer formation is likewise replaced by a respective different
codon or deleted. Instead of the complete sequences of SEQ.ID.NOs.1 or 3, parts can also be used that encode the
mature proteins or fragments described above.
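
At the nucleic acid level, the change shown as "nnn" amounts to replacing or removing one in-frame codon. The following is a minimal sketch of that operation, assuming a DNA coding sequence that starts in frame at its first base and the standard genetic code, in which TGT and TGC encode cysteine and TAA, TAG and TGA are stop codons. The function name and the toy sequence are hypothetical.

```python
from typing import Optional

CYSTEINE_CODONS = {"TGT", "TGC"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def replace_cysteine_codon(cds: str, codon_number: int,
                           new_codon: Optional[str] = None) -> str:
    """Replace or delete the cysteine codon at a 1-based codon position.

    codon_number counts codons from the first base of the given coding
    sequence. With new_codon=None the codon is deleted in frame.
    """
    start = 3 * (codon_number - 1)
    if start < 0 or start + 3 > len(cds):
        raise ValueError("codon position lies outside the sequence")
    if cds[start:start + 3].upper() not in CYSTEINE_CODONS:
        raise ValueError("codon at the given position does not encode cysteine")
    if new_codon is None:
        return cds[:start] + cds[start + 3:]
    new_codon = new_codon.upper()
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("replacement must be a three-base DNA codon")
    if new_codon in CYSTEINE_CODONS or new_codon in STOP_CODONS:
        raise ValueError("replacement must not encode cysteine or a stop")
    return cds[:start] + new_codon + cds[start + 3:]


# Toy coding sequence (hypothetical): Met-Cys-Ala.
toy_cds = "ATGTGCGCT"
print(replace_cysteine_codon(toy_cds, 2, "GCT"))  # Cys codon replaced by an Ala codon
print(replace_cysteine_codon(toy_cds, 2))         # Cys codon deleted in frame
```

Rejecting stop codons as replacements is a design choice of this sketch, made so that the edited sequence still encodes a full-length protein.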

[0030] It is preferred in the framework of the present invention that the nucleic acid, apart from the coding sequences,
also contains expression control sequences. Such expression control sequences are known to the man skilled in the
art and serve to control the expression of the encoded protein in a host cell. The host cell does not have to be an
isolated cell; rather, the nucleic acid can also be expressed in the patient in vivo in the target tissue. This can be done
by inserting the nucleic acid into the cell genome; however, it is also possible to transform host cells with expression
vectors containing a nucleic acid according to the invention. Such expression vectors are a further subject matter of
the present invention, wherein the nucleic acid is inserted in a suitable vector system, the vector system being selected
according to the desired expression of the protein. The vector system can be a eukaryotic vector system, but in the
framework of the present invention it is preferably a prokaryotic vector system, with which the proteins can be produced
in prokaryotic host cells in a particularly easy and pure manner. In addition, the expression vector can be a viral vector.

[0031] Host cells are in turn a further subject matter of the present invention. The host cells are characterized
in that they contain a nucleic acid according to the invention or an expression vector according to the invention and
that they are able to use the information present in the nucleic acids and in the expression vector, respectively, for the
expression of a monomeric protein according to the invention.

[0032] Although in the framework of the present invention eukaryotic host cells are also suitable for the production
of the protein, it is, as mentioned already several times above, particularly advantageous that the protein according to
the invention can be produced in prokaryotic host cells, which therefore represent a preferred embodiment of the
present invention.

[0033] After such preferred expression in prokaryotic host cells the protein is purified and renatured according to 
known methods, thereby effecting intramolecular cystine bridge formation. 
[0034] Since, however, not only in vitro production of the monomeric protein is possible, but also in vivo expression
of a nucleic acid according to the invention, a further preferred embodiment is a eukaryotic host cell, and especially a
eukaryotic host cell containing the DNA in its genome or as an expression vector. Such a host cell can also be useful
for application to an individual in need of morphogenic treatment.

[0035] Further subject matters of the present application are pharmaceutical compositions comprising at least one
monomeric protein according to the invention or at least one nucleic acid encoding such a protein or at least one
corresponding expression vector, or at least one eukaryotic host cell expressing the monomeric protein.
[0036] The protein itself, but also a nucleic acid according to the invention, an expression vector or a host cell can
be considered to be advantageous as active substances in a pharmaceutical composition. Combinations of
monomeric proteins, with biological activities in either the same or different applications, can also be used in preferred
pharmaceutical compositions. Especially preferred for neuronal applications are combinations of MP52 with other TGF-β
superfamily proteins, both in monomeric form, for example with GDNF (see WO 97/03188). Also preferred for neuronal
applications are combinations of TGF-β with GDNF, both in monomeric form. Also for applications concerning cartilage
and/or bone the combination of several monomeric proteins might be useful, like MP52 with a TGF-β protein (see
e.g. WO 92/09697) or MP52 with a cartilage maintenance-inducing protein such as BMP-9 (see e.g. WO 96/39170).
When a nucleic acid or an expression vector is used, however, it has to be ensured that, upon administration to the
patient, there is an environment in which the nucleic acid and the expression vector, respectively, can be
expressed and the protein according to the invention can be produced in vivo at the site of action. The same applies
accordingly to the host cell according to the invention. When using expression vectors or host cells it is also possible
that they encode more than one monomeric protein of the invention to produce a combination of two or more monomeric
proteins.

[0037] It is advantageous for the protein, the nucleic acid, the expression vector or the host cell to be applied in
and/or on a biocompatible matrix. The matrix material can be transplanted into the patient, e.g. surgically, wherein
either the protein is effective on the surface of the matrix material, or the protein or the DNA encoding the protein is
slowly released from the matrix material and is then effective over a long period of time. Additionally, it is possible
and advantageous to use a biodegradable matrix material in the pharmaceutical composition, wherein this material
preferably dissolves during the protein-induced tissue formation, so that a protein or a nucleic acid contained therein
is released and the newly formed tissue replaces the matrix material.

[0038] Finally, in the case of applications relating to bone formation, it is advantageous to use a matrix material which
is itself, e.g., osteogenically active. By using such a matrix material it becomes possible to achieve a synergistic effect
of protein and matrix material and to effect a particularly rapid and effective bone formation.

[0039] An especially preferred matrix material that can be used according to the invention is a matrix material as
described in U.S. 5,231,169 and U.S. 5,776,193, especially for applications like spinal fusion.

[0040] When using a combination of a matrix material and protein and/or nucleic acid and/or expression vector, it is
preferable to sterilize such a combination prior to its use. The matrix and the morphogenetic protein can be separately
sterilized and then combined, but it is preferred to terminally sterilize the device consisting of matrix and morphogenetic
protein. Terminal sterilization can be achieved by ionizing radiation, as already described for dimeric proteins (U.S.
5,674,292), but it is also advantageous to use ethylene oxide.

[0041] Of course, this invention also comprises pharmaceutical compositions containing further substances, e.g.
pharmacologically acceptable auxiliary and carrier substances. However, even when a matrix material is used, the
protein according to the invention does not necessarily have to be used together with this matrix material, but can also
be administered systemically, wherein it preferably concentrates in the surroundings of an implanted matrix material.
[0042] For some applications, the protein according to the invention, the nucleic acid encoding this protein, the
expression vector or the host cell can preferably be present in an injectable composition. Implants are not
necessary or possible for every form of application of the proteins according to the invention. However, it is also possible
to provide an implantable vessel or an implantable micropump containing, for example, semipermeable membranes, in
which the protein according to the invention or the nucleic acid generating it is contained and from which either one is
slowly released over a prolonged period of time. The pharmaceutical composition according to the invention can also
contain other vehicles which allow the proteins, or the nucleic acids or expression vectors encoding these proteins,
to be transported to the site of activity and released there, wherein e.g. liposomes or nanospheres can be
used. In principle, it is also possible to apply host cells, e.g. implanted embryonic cells expressing the proteins.
Cells transfected with recombinant DNA may be encapsulated prior to implantation. Any other practicable form of
application of the pharmaceutical composition according to the invention that is not explicitly described herein, and its
corresponding manufacture, are also comprised by the present invention, as long as they contain a protein according
to the invention or a nucleic acid or an expression vector coding therefor, or a host cell expressing it.

[0043] Although the indications shall not be restricted herein, and all indications of the dimeric form of the
protein according to the invention are also comprised, the types of application listed in the following are considered
to be particularly preferred indications for the proteins of the present invention. On the one hand, there is the prevention
or therapy of diseases associated with bone and/or cartilage damage or affecting bone and/or cartilage disease, or
generally situations in which cartilage and/or bone formation is desirable, or spinal fusion; on the other hand, there is
the prevention or therapy of damaged or diseased tissue associated with connective tissue including tendon and/or
ligament, periodontal or dental tissue including dental implants, neural tissue including CNS tissue and
neuropathological situations, tissue of the sensory system, liver, pancreas, cardiac, blood vessel, renal, uterine and
thyroid tissue, skin, mucous membranes, endothelium, epithelium, for promotion or induction of nerve growth, tissue
regeneration, angiogenesis, wound healing including ulcers, burns, injuries or skin grafts, induction of proliferation of
progenitor cells or bone marrow cells, for maintenance of a state of proliferation or differentiation, for treatment or
preservation of tissue or cells for organ or tissue transplantation, for integrity of gastrointestinal lining, for treatment
of disturbances in fertility, contraception or pregnancy.

[0044] Diseases concerning sensory organs like the eye are also to be included in the preferred indications of the
pharmaceutical composition according to the invention. As neuronal diseases, Parkinson's and Alzheimer's diseases
can again be mentioned as examples.

[0045] The pharmaceutical compositions according to the invention can be used in any desired way; they are
preferably formulated for surgical local application, topical or systemic application. Auxiliary
substances for the individual application form can of course be present in the pharmaceutical composition according
to the invention. For some applications it can be advantageous to add further substances to the pharmaceutical
composition, for example vitamin D (WO 92/21365), parathyroid hormone related peptide (WO 97/35607), chordin
(WO 98/21335), anti-fibrinolytic agent (EP 535091), anti-metabolites (WO 95/09004), alkyl cellulose (WO 93/00050),
mannitol (WO 98/33514), sugar, glycine, glutamic acid hydrochloride (U.S. 5,385,887), antibiotics, antiseptics, amino
acids and/or additives which improve the solubility or stability of the monomeric morphogenetic protein, for example
nonionic detergents (e.g. Tween 80), basic amino acids, carrier proteins (e.g. serum albumin), or full-length propeptides
of the TGF-β superfamily or parts thereof.

[0046] As can already be gathered from the description of proteins, nucleic acids and pharmaceutical compositions,
the proteins according to the invention, and the respective nucleic acids which provide for expression of the proteins
at the site of activity, can advantageously be applied in all areas in which the dimeric forms of the proteins, as
described, can also be applied. In the framework of the present invention, a further subject matter is therefore the use
of a pharmaceutical composition according to the present invention for the treatment or prevention of any indications
of the dimeric forms of the proteins according to the invention.

[0047] Herein it is again possible to conduct surgical operations and to implant the pharmaceutical composition (in
particular contained on a matrix material); administration in liquid or otherwise suitable form, e.g. via injection or
oral administration, seems to be as suitable as a topical application, e.g. for tissue regeneration.
[0048] Fig. 1A shows a two-dimensional graph of the conformation of recombinantly produced dimeric MP52 with
the deleted first alanine. In this figure the 7 cystine bridges contained in a dimer are shown, wherein there are 3
intramolecular cystine bridges per monomer unit and 1 intermolecular cystine bridge connecting both monomers. Fig.
1B shows the monomeric protein according to the invention, wherein the cysteine of the amino acid sequence of MP52
has been replaced by X, which denotes any amino acid except cysteine.
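
The bridge count given for Fig. 1A can be checked with a short tally. The following is only an illustrative Python sketch; the function name and its parameters are hypothetical and are not part of the application, and the monomer line assumes that the three intramolecular bridges are retained once the connecting cysteine is replaced by X.

```python
def cystine_bridges(monomer_units: int, intramolecular_per_unit: int, intermolecular: int) -> int:
    """Total number of cystine bridges in an assembly of monomer units."""
    return monomer_units * intramolecular_per_unit + intermolecular


# Dimeric MP52 as described for Fig. 1A:
# 3 intramolecular bridges per monomer unit plus 1 bridge connecting both monomers.
dimer_total = cystine_bridges(monomer_units=2, intramolecular_per_unit=3, intermolecular=1)

# Monomeric protein of Fig. 1B (assumption: only the connecting bridge is lost).
monomer_total = cystine_bridges(monomer_units=1, intramolecular_per_unit=3, intermolecular=0)

print(dimer_total, monomer_total)
```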

SEQUENCE LISTING 

<110> HyGene AG 

<120> Monomeric Protein of the TGF-beta Family 

<130> 20780PEP Monomeric TGF-beta protein 

<140> 
<141> 

<160> 4 

<170> PatentIn Ver. 2.1 

<210> 1 

<211> 2703 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (640)..(2142) 
<400> 1 

ccatggcctc gaaagggcag cggtgatttt tttcacataa atatatcgca cttaaatgag 60 
tttagacagc atgacatcag agagtaatta aattggtttg ggttggaatt ccgtttccaa 120 
ttcctgagtt caggtttgta aaagattttt ctgagcacct gcaggcctgt gagtgtgtgt 180 
gtgtgtgtgt gtgtgtgtgt gtgtgtgtga agtattttca ctggaaagga ttcaaaacta 240 
gggggaaaaa aaaactggag cacacaggca gcattacgcc attcttcctt cttggaaaaa 300 
tccctcagcc ttatacaagc ctccttcaag ccctcagtca gttgtgcagg agaaaggggg 360 



EP 1 074 620 A1 

cggttggctt tctcctttca agaacgagtt attttcagct gctgactgga gacggtgcac 420 
gtctggatac gagagcattt ccactatggg actggataca aacacacacc cggcagactt 480 
caagagtctc agactgagga gaaagccttt ccttctgctg ctactgctgc tgccgctgct 540 
tttgaaagtc cactcctttc atggtttttc ctgccaaacc agaggcacct ttgctgctgc 600 

cgctgttctc tttggtgtca ttcagcggct ggccagagg atg aga ctc ccc aaa 654 

Met Arg Leu Pro Lys 
1 5 

ctc ctc act ttc ttg ctt tgg tac ctg gct tgg ctg gac ctg gaa ttc 702 

Leu Leu Thr Phe Leu Leu Trp Tyr Leu Ala Trp Leu Asp Leu Glu Phe 
10 15 20 

atc tgc act gtg ttg ggt gcc cct gac ttg ggc cag aga ccc cag ggg 750 

Ile Cys Thr Val Leu Gly Ala Pro Asp Leu Gly Gln Arg Pro Gln Gly 
25 30 35 

acc agg cca gga ttg gcc aaa gca gag gcc aag gag agg ccc ccc ctg 798 

Thr Arg Pro Gly Leu Ala Lys Ala Glu Ala Lys Glu Arg Pro Pro Leu 
40 45 50 

gcc cgg aac gtc ttc agg cca ggg ggt cac agc tat ggt ggg ggg gcc 846 

Ala Arg Asn Val Phe Arg Pro Gly Gly His Ser Tyr Gly Gly Gly Ala 
55 60 65 

acc aat gcc aat gcc agg gca aag gga ggc acc ggg cag aca gga ggc 894 

Thr Asn Ala Asn Ala Arg Ala Lys Gly Gly Thr Gly Gln Thr Gly Gly 
70 75 80 85 

ctg aca cag ccc aag aag gat gaa ccc aaa aag ctg ccc ccc aga ccg 942 

Leu Thr Gln Pro Lys Lys Asp Glu Pro Lys Lys Leu Pro Pro Arg Pro 
90 95 100 

ggc ggc cct gaa ccc aag cca gga cac cct ccc caa aca agg cag gct 990 

Gly Gly Pro Glu Pro Lys Pro Gly His Pro Pro Gln Thr Arg Gln Ala 
105 110 115 

aca gcc cgg act gtg acc cca aaa gga cag ctt ccc gga ggc aag gca 1038 

Thr Ala Arg Thr Val Thr Pro Lys Gly Gln Leu Pro Gly Gly Lys Ala 
120 125 130 

ccc cca aaa gca gga tct gtc ccc agc tcc ttc ctg ctg aag aag gcc 1086 

Pro Pro Lys Ala Gly Ser Val Pro Ser Ser Phe Leu Leu Lys Lys Ala 
135 140 145 

agg gag ccc ggg ccc cca cga gag ccc aag gag ccg ttt cgc cca ccc 1134 

Arg Glu Pro Gly Pro Pro Arg Glu Pro Lys Glu Pro Phe Arg Pro Pro 
150 155 160 165 

ccc atc aca ccc cac gag tac atg ctc tcg ctg tac agg acg ctg tcc 1182 

Pro Ile Thr Pro His Glu Tyr Met Leu Ser Leu Tyr Arg Thr Leu Ser 
170 175 180 

gat gct gac aga aag gga ggc aac agc agc gtg aag ttg gag gct ggc 1230 

Asp Ala Asp Arg Lys Gly Gly Asn Ser Ser Val Lys Leu Glu Ala Gly 
185 190 195 

ctg gcc aac acc atc acc agc ttt att gac aaa ggg caa gat gac cga 1278 

Leu Ala Asn Thr Ile Thr Ser Phe Ile Asp Lys Gly Gln Asp Asp Arg 
200 205 210 

ggt ccc gtg gtc agg aag cag agg tac gtg ttt gac att agt gcc ctg 1326 

Gly Pro Val Val Arg Lys Gln Arg Tyr Val Phe Asp Ile Ser Ala Leu 
215 220 225 

gag aag gat ggg ctg ctg ggg gcc gag ctg cgg atc ttg cgg aag aag 1374 

Glu Lys Asp Gly Leu Leu Gly Ala Glu Leu Arg Ile Leu Arg Lys Lys 
230 235 240 245 

ccc tcg gac acg gcc aag cca gcg gcc ccc gga ggc ggg cgg gct gcc 1422 

Pro Ser Asp Thr Ala Lys Pro Ala Ala Pro Gly Gly Gly Arg Ala Ala 
250 255 260 

cag ctg aag ctg tcc agc tgc ccc agc ggc cgg cag ccg gcc tcc ttg 1470 

Gln Leu Lys Leu Ser Ser Cys Pro Ser Gly Arg Gln Pro Ala Ser Leu 
265 270 275 

ctg gat gtg cgc tcc gtg cca ggc ctg gac gga tct ggc tgg gag gtg 1518 

Leu Asp Val Arg Ser Val Pro Gly Leu Asp Gly Ser Gly Trp Glu Val 
280 285 290 

ttc gac atc tgg aag ctc ttc cga aac ttt aag aac tcg gcc cag ctg 1566 

Phe Asp Ile Trp Lys Leu Phe Arg Asn Phe Lys Asn Ser Ala Gln Leu 
295 300 305 

tgc ctg gag ctg gag gcc tgg gaa cgg ggc agg gcc gtg gac ctc cgt 1614 

Cys Leu Glu Leu Glu Ala Trp Glu Arg Gly Arg Ala Val Asp Leu Arg 
310 315 320 325 

ggc ctg ggc ttc gac cgc gcc gcc cgg cag gtc cac gag aag gcc ctg 1662 

Gly Leu Gly Phe Asp Arg Ala Ala Arg Gln Val His Glu Lys Ala Leu 
330 335 340 

ttc ctg gtg ttt ggc cgc acc aag aaa cgg gac ctg ttc ttt aat gag 1710 

Phe Leu Val Phe Gly Arg Thr Lys Lys Arg Asp Leu Phe Phe Asn Glu 
345 350 355 

att aag gcc cgc tct ggc cag gac gat aag acc gtg tat gag tac ctg 1758 

Ile Lys Ala Arg Ser Gly Gln Asp Asp Lys Thr Val Tyr Glu Tyr Leu 
360 365 370 

ttc agc cag cgg cga aaa cgg cgg gcc cca ctg gcc act cgc cag ggc 1806 

Phe Ser Gln Arg Arg Lys Arg Arg Ala Pro Leu Ala Thr Arg Gln Gly 
375 380 385 

aag cga ccc agc aag aac ctt aag gct cgc tgc agt cgg aag gca ctg 1854 

Lys Arg Pro Ser Lys Asn Leu Lys Ala Arg Cys Ser Arg Lys Ala Leu 
390 395 400 405 

cat gtc aac ttc aag gac atg ggc tgg gac gac tgg atc atc gca ccc 1902 

His Val Asn Phe Lys Asp Met Gly Trp Asp Asp Trp Ile Ile Ala Pro 
410 415 420 

ctt gag tac gag gct ttc cac tgc gag ggg ctg tgc gag ttc cca ttg 1950 

Leu Glu Tyr Glu Ala Phe His Cys Glu Gly Leu Cys Glu Phe Pro Leu 
425 430 435 

cgc tcc cac ctg gag ccc acg aat cat gca gtc atc cag acc ctg atg 1998 

Arg Ser His Leu Glu Pro Thr Asn His Ala Val Ile Gln Thr Leu Met 
440 445 450 

aac tcc atg gac ccc gag tcc aca cca ccc acc nnn tgt gtg ccc acg 2046 

Asn Ser Met Asp Pro Glu Ser Thr Pro Pro Thr Xaa Cys Val Pro Thr 
455 460 465 

cgg ctg agt ccc atc agc atc ctc ttc att gac tct gcc aac aac gtg 2094 

Arg Leu Ser Pro Ile Ser Ile Leu Phe Ile Asp Ser Ala Asn Asn Val 
470 475 480 485 

gtg tat aag cag tat gag gac atg gtc gtg gag tcg tgt ggc tgc agg 2142 

Val Tyr Lys Gln Tyr Glu Asp Met Val Val Glu Ser Cys Gly Cys Arg 
490 495 500 

tagcagcact ggccctctgt cttcctgggt ggcacatccc aagagcccct tcctgcactc 2202 
ctggaatcac agaggggtca ggaagctgtg gcaggagcat ctacacagct tgggtgaaag 2262 
gggattccaa taagcttgct cgctctctga gtgtgacttg ggctaaaggc ccccttttat 2322 
ccacaagttc ccctggctga ggattgctgc ccgtctgctg atgtgaccag tggcaggcac 2382 
aggtccaggg agacagactc tgaatgggac tgagtcccag gaaacagtgc tttccgatga 2442 
gactcagccc accatttctc ctcacctggg ccttctcagc ctctggactc tcctaagcac 2502 
ctctcaggag agccacaggt gccactgcct cctcaaatca catttgtgcc tggtgacttc 2562 
ctgtccctgg gacagttgag aagctgactg ggcaagagtg ggagagaaga ggagagggct 2622 
tggatagagt tgaggagtgt gaggctgtta gactgttaga tttaaatgta tattgatgag 2682 
ataaaaagca aaactgtgcc t 2703 
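
The codon rows of a listing like SEQ.ID.NO.1 can be cross-checked against the amino acid rows printed beneath them, and the CDS coordinates against the stated protein length. The following Python sketch is an illustration only: it assumes the standard genetic code, and the helper names `translate` and `cds_codon_count` are hypothetical and not part of the application.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(codons: str) -> str:
    """Translate whitespace-separated codons into one-letter amino acid codes.

    Codons that are not in the table (for example 'nnn') are reported as 'X'.
    """
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons.upper().split())


def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based, inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


first_row = translate("atg aga ctc ccc aaa")   # first five codons of the CDS
mutated_site = translate("acc nnn tgt")        # codons around the replaced cysteine
codons_seq1 = cds_codon_count(640, 2142)       # coordinates from the <222> field
```

Here `codons_seq1` agrees with the length of 501 given in the `<211>` field of SEQ.ID.NO.2, and the same check on the coordinates (128)..(1183) of SEQ.ID.NO.3 gives the 352 residues of SEQ.ID.NO.4.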



<210> 2 
<211> 501 
<212> PRT 

<213> Homo sapiens 



<400> 2 

Met Arg Leu Pro Lys Leu Leu Thr Phe Leu Leu Trp Tyr Leu Ala Trp 
1 5 10 15 

Leu Asp Leu Glu Phe Ile Cys Thr Val Leu Gly Ala Pro Asp Leu Gly 
20 25 30 

Gln Arg Pro Gln Gly Thr Arg Pro Gly Leu Ala Lys Ala Glu Ala Lys 
35 40 45 

Glu Arg Pro Pro Leu Ala Arg Asn Val Phe Arg Pro Gly Gly His Ser 
50 55 60 

Tyr Gly Gly Gly Ala Thr Asn Ala Asn Ala Arg Ala Lys Gly Gly Thr 
65 70 75 80 

Gly Gln Thr Gly Gly Leu Thr Gln Pro Lys Lys Asp Glu Pro Lys Lys 
85 90 95 

Leu Pro Pro Arg Pro Gly Gly Pro Glu Pro Lys Pro Gly His Pro Pro 
100 105 110 

Gln Thr Arg Gln Ala Thr Ala Arg Thr Val Thr Pro Lys Gly Gln Leu 
115 120 125 

Pro Gly Gly Lys Ala Pro Pro Lys Ala Gly Ser Val Pro Ser Ser Phe 
130 135 140 

Leu Leu Lys Lys Ala Arg Glu Pro Gly Pro Pro Arg Glu Pro Lys Glu 
145 150 155 160 

Pro Phe Arg Pro Pro Pro Ile Thr Pro His Glu Tyr Met Leu Ser Leu 
165 170 175 

Tyr Arg Thr Leu Ser Asp Ala Asp Arg Lys Gly Gly Asn Ser Ser Val 
180 185 190 

Lys Leu Glu Ala Gly Leu Ala Asn Thr Ile Thr Ser Phe Ile Asp Lys 
195 200 205 

Gly Gln Asp Asp Arg Gly Pro Val Val Arg Lys Gln Arg Tyr Val Phe 
210 215 220 

Asp Ile Ser Ala Leu Glu Lys Asp Gly Leu Leu Gly Ala Glu Leu Arg 
225 230 235 240 

Ile Leu Arg Lys Lys Pro Ser Asp Thr Ala Lys Pro Ala Ala Pro Gly 
245 250 255 

Gly Gly Arg Ala Ala Gln Leu Lys Leu Ser Ser Cys Pro Ser Gly Arg 
260 265 270 

Gln Pro Ala Ser Leu Leu Asp Val Arg Ser Val Pro Gly Leu Asp Gly 
275 280 285 

Ser Gly Trp Glu Val Phe Asp Ile Trp Lys Leu Phe Arg Asn Phe Lys 
290 295 300 

Asn Ser Ala Gln Leu Cys Leu Glu Leu Glu Ala Trp Glu Arg Gly Arg 
305 310 315 320 

Ala Val Asp Leu Arg Gly Leu Gly Phe Asp Arg Ala Ala Arg Gln Val 
325 330 335 

His Glu Lys Ala Leu Phe Leu Val Phe Gly Arg Thr Lys Lys Arg Asp 
340 345 350 

Leu Phe Phe Asn Glu Ile Lys Ala Arg Ser Gly Gln Asp Asp Lys Thr 
355 360 365 

Val Tyr Glu Tyr Leu Phe Ser Gln Arg Arg Lys Arg Arg Ala Pro Leu 
370 375 380 

Ala Thr Arg Gln Gly Lys Arg Pro Ser Lys Asn Leu Lys Ala Arg Cys 
385 390 395 400 

Ser Arg Lys Ala Leu His Val Asn Phe Lys Asp Met Gly Trp Asp Asp 
405 410 415 

Trp Ile Ile Ala Pro Leu Glu Tyr Glu Ala Phe His Cys Glu Gly Leu 
420 425 430 

Cys Glu Phe Pro Leu Arg Ser His Leu Glu Pro Thr Asn His Ala Val 
435 440 445 

Ile Gln Thr Leu Met Asn Ser Met Asp Pro Glu Ser Thr Pro Pro Thr 
450 455 460 

Xaa Cys Val Pro Thr Arg Leu Ser Pro Ile Ser Ile Leu Phe Ile Asp 
465 470 475 480 

Ser Ala Asn Asn Val Val Tyr Lys Gln Tyr Glu Asp Met Val Val Glu 
485 490 495 

Ser Cys Gly Cys Arg 
500 
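
For downstream tools it can be convenient to convert the three-letter residue rows of a protein listing such as SEQ.ID.NO.2 into one-letter code. The following Python sketch is illustrative only; the mapping covers the twenty standard residues plus `Xaa`, and the function name `to_one_letter` is hypothetical.

```python
# Three-letter to one-letter residue codes; 'Xaa' marks the variable position.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter residues to a one-letter string.

    Capitalisation is normalised first, so 'val' and 'Val' are treated alike.
    An unknown residue name raises KeyError instead of being guessed.
    """
    return "".join(THREE_TO_ONE[name.capitalize()] for name in residues.split())


c_terminus = to_one_letter("Ser Cys Gly Cys Arg")       # residues 497 to 501
variable_site = to_one_letter("Xaa Cys Val Pro Thr")    # residues 465 to 469
```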



<210> 3 

<211> 2272 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (128)..(1183) 

<400> 3 
tac gcc atg aac ttc tgc ata ggg cag tgc cca cta cac ata gca ggc 985 

Tyr Ala Met Asn Phe Cys Ile Gly Gln Cys Pro Leu His Ile Ala Gly 
275 280 285 

atg cct ggt att gct gcc tcc ttt cac act gca gtg ctc aat ctt ctc 1033 

Met Pro Gly Ile Ala Ala Ser Phe His Thr Ala Val Leu Asn Leu Leu 
290 295 300 

aag gcc aac aca gct gca ggc acc act gga ggg ggc tca nnn tgt gta 1081 

Lys Ala Asn Thr Ala Ala Gly Thr Thr Gly Gly Gly Ser Xaa Cys Val 
305 310 315 

ccc acg gcc cgg cgc ccc ctg tct ctg ctc tat tat gac agg gac agc 1129 

Pro Thr Ala Arg Arg Pro Leu Ser Leu Leu Tyr Tyr Asp Arg Asp Ser 
320 325 330 

aac att gtc aag act gac ata cct gac atg gta gta gag gcc tgt ggg 1177 

Asn Ile Val Lys Thr Asp Ile Pro Asp Met Val Val Glu Ala Cys Gly 
335 340 345 350 

tgc agt tagtctatgt gtggtatggg cagcccaagg ttgcatggga aaacacgccc 1233 

Cys Ser 

ctacagaagt gcacttcctt gagaggaggg aatgacctca ttctctgtcc agaatgtgga 1293 
ctccctcttc ctgagcatct tatggaaatt accccacctt tgacttgaag aaaccttcat 1353 
ctaaagcaag tcactgtgcc atcttcctga ccactaccct ctttcctagg gcatagtcca 1413 
tcccgctagt ccatcccgct agccccactc cagggactca gacccatctc caaccatgag 1473 
caatgccatc tggttcccag gcaaagacac ccttagctca cctttaatag accccataac 1533 
ccactatgcc ttcctgtcct ttctactcaa tggtccccac tccaagatga gttgacacaa 1593 
ccccttcccc caatttttgt ggatctccag agaggccctt ctttggattc accaaagttt 1653 
agatcactgc tgcccaaaat agaggcttac ctacccccct ctttgttgtg agcccctgtc 1713 
cttcttagtt gtccaggtga actactaaag ctctctttgc ataccttcat ccattttttg 1773 
tccttctctg cctttctcta tgcccttaag gggtgacttg cctgagctct atcacctgag 1833 
ctcccctgcc ctctggcttc ctgctgaggt cagggcattt cttatccctg ttccctctct 1893 
gtctaggtgt catggttctg tgtaactgtg gctattctgt gtccctacac tacctggcta 1953 
cccccttcca tggccccagc tctgcctaca ttctgatttt tttttttttt ttttttttga 2013 
aaagttaaaa attccttaat tttttattcc tggtaccact accacaattt acagggcaat 2073 
atacctgatg taatgaaaag aaaaagaaaa agacaaagct acaacagata aaagacctca 2133 
ggaatgtaca tctaattgac actacattgc attaatcaat agctgcactt tttgcaaact 2193 
gtggctatga cagtcctgaa caagaagggt ttcctgttta agctgcagta acttttctga 2253 
ctatggatca tcgttcctt 2272 

<210> 4 
<211> 352 
<212> PRT 

<213> Homo sapiens 
<400> 4 

Met Thr Ser Ser Leu Leu Leu Ala Phe Leu Leu Leu Ala Pro Thr Thr 
1 5 10 15 

Val Ala Thr Pro Arg Ala Gly Gly Gln Cys Pro Ala Cys Gly Gly Pro 
20 25 30 

Thr Leu Glu Leu Glu Ser Gln Arg Glu Leu Leu Leu Asp Leu Ala Lys 
35 40 45 

Arg Ser Ile Leu Asp Lys Leu His Leu Thr Gln Arg Pro Thr Leu Asn 
50 55 60 

Arg Pro Val Ser Arg Ala Ala Leu Arg Thr Ala Leu Gln His Leu His 
65 70 75 80 

Gly Val Pro Gln Gly Ala Leu Leu Glu Asp Asn Arg Glu Gln Glu Cys 
85 90 95 

Glu Ile Ile Ser Phe Ala Glu Thr Gly Leu Ser Thr Ile Asn Gln Thr 
100 105 110 

Arg Leu Asp Phe His Phe Ser Ser Asp Arg Thr Ala Gly Asp Arg Glu 
115 120 125 

Val Gln Gln Ala Ser Leu Met Phe Phe Val Gln Leu Pro Ser Asn Thr 
130 135 140 

Thr Trp Thr Leu Lys Val Arg Val Leu Val Leu Gly Pro His Asn Thr 
145 150 155 160 

Asn Leu Thr Leu Ala Thr Gln Tyr Leu Leu Glu Val Asp Ala Ser Gly 
165 170 175 

Trp His Gln Leu Pro Leu Gly Pro Glu Ala Gln Ala Ala Cys Ser Gln 
180 185 190 

Gly His Leu Thr Leu Glu Leu Val Leu Glu Gly Gln Val Ala Gln Ser 
195 200 205 

Ser Val Ile Leu Gly Gly Ala Ala His Arg Pro Phe Val Ala Ala Arg 
210 215 220 

Val Arg Val Gly Gly Lys His Gln Ile His Arg Arg Gly Ile Asp Cys 
225 230 235 240 

Gln Gly Gly Ser Arg Met Cys Cys Arg Gln Glu Phe Phe Val Asp Phe 
245 250 255 

Arg Glu Ile Gly Trp His Asp Trp Ile Ile Gln Pro Glu Gly Tyr Ala 
260 265 270 

Met Asn Phe Cys Ile Gly Gln Cys Pro Leu His Ile Ala Gly Met Pro 
275 280 285 

Gly Ile Ala Ala Ser Phe His Thr Ala Val Leu Asn Leu Leu Lys Ala 
290 295 300 

Asn Thr Ala Ala Gly Thr Thr Gly Gly Gly Ser Xaa Cys Val Pro Thr 
305 310 315 320 

Ala Arg Arg Pro Leu Ser Leu Leu Tyr Tyr Asp Arg Asp Ser Asn Ile 
325 330 335 

Val Lys Thr Asp Ile Pro Asp Met Val Val Glu Ala Cys Gly Cys Ser 
340 345 350 




Claims 



1. Protein selected from the members of the TGF-β superfamily, 

characterized in that the protein is necessarily monomeric due to substitution or deletion of a cysteine which is 
responsible for dimer formation. 

2. Protein according to claim 1, 

characterized in that the protein is a mature protein or a biologically active part or variant thereof. 

3. Protein according to any one of the preceding claims, 

characterized in that the protein contains at least the 7 cysteine region characteristic for the TGF-β protein 
superfamily. 

4. Protein according to claim 3, 

characterized in that it contains a consensus sequence according to Formula I: C(Y)₂₅₋₂₉CYYYC(Y)₂₅₋₃₅XC(Y)₂₇₋₃₄CYC 
or Formula II: C(Y)₂₈CYYYC(Y)₃₀₋₃₂XC(Y)₃₁CYC, wherein C denotes cysteine, Y denotes any amino 
acid and X denotes any amino acid except cysteine. 
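
A consensus of this kind can be written as a regular expression, with `.` standing for Y (any amino acid) and `[^C]` for X (any amino acid except cysteine). The following Python sketch is illustrative only and is not part of the claims: the pattern names are hypothetical, and the test fragment is a one-letter transcription of residues 400 to 500 of SEQ.ID.NO.2 made for this example.

```python
import re

# Formula I:  C(Y)25-29 CYYYC (Y)25-35 X C (Y)27-34 CYC
FORMULA_I = re.compile(r"C.{25,29}C.{3}C.{25,35}[^C]C.{27,34}C.C")
# Formula II: C(Y)28 CYYYC (Y)30-32 X C (Y)31 CYC
FORMULA_II = re.compile(r"C.{28}C.{3}C.{30,32}[^C]C.{31}C.C")

# Residues 400 to 500 of SEQ.ID.NO.2 in one-letter code; 'X' marks Xaa at position 465.
MP52_400_500 = (
    "C"
    "SRKALHVNFKDMGWDD"
    "WIIAPLEYEAFHCEGL"
    "CEFPLRSHLEPTNHAV"
    "IQTLMNSMDPESTPPT"
    "XCVPTRLSPISILFID"
    "SANNVVYKQYEDMVVE"
    "SCGC"
)


def matches(formula: re.Pattern, sequence: str) -> bool:
    """True if the whole sequence fits the consensus formula."""
    return formula.fullmatch(sequence) is not None


# Variant with alanine at the X position (one of the substitutions named in claim 11).
variant = MP52_400_500.replace("X", "A")
# Same fragment with cysteine put back at the X position, for comparison.
cysteine_at_x = MP52_400_500.replace("X", "C")

print(matches(FORMULA_II, variant), matches(FORMULA_II, cysteine_at_x))
```

For this fragment the fixed spacings of Formula II leave only one possible alignment, so a cysteine at the X position makes the Formula II match fail. Formula I has wider ranges and is shown only for the variant.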

5. Protein according to any one of claims 1 to 4, 

characterized in that the protein is a morphogenetic protein. 

6. Protein according to any one of the preceding claims, 

characterized in that the protein belongs to the TGF-β, BMP, GDF, activin or GDNF family. 

7. Protein according to claim 6, 

characterized in that the protein is MP52 (GDF5) or a biologically active part or variant thereof. 

8. Protein according to any one of the preceding claims, 

characterized in that it comprises the amino acid sequence according to SEQ.ID.NO.2 or a part thereof. 

9. Protein according to claim 6, 

characterized in that the protein is MP121 or a biologically active part or variant thereof. 

10. Protein according to claim 9, 

characterized in that it comprises the amino acid sequence according to SEQ.ID.NO.4 or a part thereof. 

11. Protein according to any one of claims 1 to 10, 

characterized in that the cysteine residue is substituted by an amino acid selected from the group of alanine, serine, 
threonine, leucine, isoleucine, glycine and valine. 
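
The substitution described in claim 11 can be illustrated on a sequence string. The following Python sketch is illustrative only and is not part of the claims: the function name is hypothetical, and the short peptide is a toy example modelled on residues 461 to 469 of SEQ.ID.NO.2 with cysteine assumed at the Xaa position.

```python
# One-letter codes of the replacement residues named in claim 11.
ALLOWED_SUBSTITUTES = {
    "A": "alanine", "S": "serine", "T": "threonine", "L": "leucine",
    "I": "isoleucine", "G": "glycine", "V": "valine",
}


def substitute_cysteine(sequence: str, position: int, replacement: str) -> str:
    """Replace the cysteine at a 1-based position by one of the allowed residues."""
    if replacement not in ALLOWED_SUBSTITUTES:
        raise ValueError(f"{replacement!r} is not one of the listed substitutes")
    if sequence[position - 1] != "C":
        raise ValueError(f"residue {position} is not a cysteine")
    return sequence[: position - 1] + replacement + sequence[position:]


# Toy peptide: the cysteine to be replaced is residue 5, followed by a second cysteine.
example = substitute_cysteine("TPPTCCVPT", 5, "A")
print(example)
```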

12. Protein according to any one of claims 1 to 11, 

characterized in that it contains additional amino acids that facilitate or mediate the transfer and localization of the 
protein in a certain tissue. 

13. Nucleic acid, 

characterized in that it encodes a protein according to any one of claims 1 to 12. 

14. Nucleic acid according to claim 13, 
characterized in that it is a DNA. 

15. Nucleic acid according to claim 13 or 14, 

characterized in that it contains a sequence as shown in SEQ.ID.NO.1 or a fragment thereof. 

16. Nucleic acid according to claim 13 or 14, 

characterized in that it contains a sequence as shown in SEQ.ID.NO.3 or a fragment thereof. 

17. Nucleic acid according to any one of claims 13 to 16, 

characterized in that it further contains suitable expression control sequences facilitating and/or driving expression 
of the encoded protein. 

18. Expression vector, 

characterized in that it contains a nucleic acid according to any one of claims 13 to 17 in a suitable vector system. 

19. Expression vector according to claim 18, 

characterized in that the vector system is suitable for prokaryotic expression. 

20. Host cell, 

characterized in that it contains a nucleic acid according to any one of claims 13 to 17 or an expression vector 
according to claim 18 or 19 and upon expression of said nucleic acid or vector is able to produce a monomeric 
protein according to any one of claims 1 to 12. 

21. Host cell according to claim 20, 

characterized in that it is a prokaryotic host cell. 

22. Host cell according to claim 20, 
characterized in that it is an embryonal cell. 

23. Pharmaceutical composition, 

characterized in that it contains at least one protein according to any one of claims 1 to 12 or at least one nucleic 
acid according to any one of claims 13 to 17, at least one expression vector according to any one of claims 18 or 
19 or at least one host cell according to claim 20 or 22. 

24. Pharmaceutical composition according to claim 23, 

characterized in that the protein and/or nucleic acid are contained in and/or on a biocompatible matrix material. 

25. Pharmaceutical composition according to claim 24, 
characterized in that the matrix material is biodegradable. 

26. Pharmaceutical composition according to claims 24 or 25, 
characterized in that the matrix material is itself osteogenically active. 

27. Pharmaceutical composition according to any one of claims 23 to 26, 

for the prevention or therapy of diseases for which also the dimeric form of the protein would be indicated. 

28. Pharmaceutical composition according to claim 27, 

for prevention or therapy of diseases associated with bone and/or cartilage damage or affecting bone and/or 
cartilage disease or situations in which cartilage and/or bone growth is desirable or for spinal fusion. 

29. Pharmaceutical composition according to claim 27, 

for prevention or therapy of damaged or diseased tissue associated with connective tissue including tendon and/ 
or ligament, periodontal or dental tissue including dental implants, neural tissue including CNS tissue and 
neuropathological situations, tissue of the sensory system, liver, pancreas, cardiac, blood vessel, renal, uterine and 
thyroid tissue, skin, mucous membranes, endothelium, epithelium, for promotion or induction of nerve growth, 
tissue regeneration, angiogenesis, wound healing including ulcers, burns, injuries or skin grafts, induction of 
proliferation of progenitor cells or bone marrow cells, for maintenance of a state of proliferation or differentiation, for 
treatment or preservation of tissue or cells for organ or tissue transplantation, for integrity of gastrointestinal lining, 
for treatment of disturbances in fertility, contraception or pregnancy. 

30. Pharmaceutical composition according to any one of claims 23 to 29 for surgical local application, topical or 
systemic application. 

31. Pharmaceutical composition according to any one of claims 23 to 30, 

characterized in that it further contains pharmacologically acceptable auxiliary substances. 

32. Pharmaceutical composition according to any one of claims 30 or 31, 
characterized in that the composition is injectable. 

33. Pharmaceutical composition according to any one of claims 30 to 32, 

characterized in that it is contained in a vehicle that allows the composition to be directed to, and released at, a 
determined site of action. 

34. Pharmaceutical composition according to claim 33, 

characterized in that the vehicle is selected from liposomes, nanospheres, larger implantable containers and 
micropumps. 

35. Use of a pharmaceutical composition according to any one of claims 23 to 34 for the prevention or treatment of 
any indications of the dimeric form of the protein. 

Fig. 1A 

Name: MP52, dimeric form 
Formula: C1184H1844N330O350S22 
Molecular weight: 26994 Dalton 
Amino acid composition: 238 amino acids 
Disulfide bonds: 7 
Primary structure 

Fig. 1B 

(1) nucleic acid (II) encoding (I); 

(2) expression vector (III) containing (II) in a 
suitable vector system; 

(3) host cell (IV) containing (III) capable of producing 
(I); and 

(4) a pharmaceutical composition (V) containing (I), 
(II), (III) or (IV). 

Gene therapy. No supporting data is given. 

USE - (V) is useful for the prevention or therapy of 
diseases for which also the dimeric form of the protein 
would be indicated. Diseases treatable include diseases 
associated with bone and/or cartilage damage or 
affecting bone and/or cartilage disease or situations in 
which cartilage and/or bone growth is desirable, for 
spinal fusion, for damaged or diseased tissue associated 
with connective tissue including tendon and/or ligament, 
periodontal or dental tissue including dental implants, 
neural tissue including CNS tissue and neuropathological 
situations, tissue of the sensory system, liver, 
pancreas, cardiac, blood vessel, renal, uterine and 
thyroid tissue, skin, mucous membrane, endothelium, 
epithelium, for promotion or induction of nerve growth, 
tissue regeneration, angiogenesis, wound healing 
including ulcers, burns, injuries or skin grafts, 
induction of proliferation of progenitor cells or bone 
marrow cells, for maintenance of a state of 
proliferation or differentiation, for treatment or 
preservation of tissue or cells for organ or tissue 
transplantation, for integrity of gastrointestinal 
lining and for treatment of disturbances in fertility, 
contraception or pregnancy (all claimed). 

EQUIVALENT— ABSTRACTS : 

BIOTECHNOLOGY 
Preferred Protein: (I) is a mature protein or a
biologically active part or variant thereof and contains at
least the 7 cysteine region characteristic of the TGF-beta
protein superfamily. (I) contains a consensus sequence
of formula
Cys-(Y)(25-29)-Cys-Y-Y-Cys-(Y)(25-35)-X-Cys-(Y)(27-34)-Cys-Y-Cys or
Cys-(Y)28-Cys-Y-Y-Y-Cys-(Y)(30-32)-X-Cys-(Y)31-Cys-Y-Cys,

Y = any amino acid 

X = amino acid except cysteine 
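The consensus formulas can be read as patterns over a one-letter amino acid sequence. The following is a minimal sketch, assuming one-letter codes, with Y mapped to "any residue" and X to "any residue except C". The spacer lengths are transcribed exactly as printed in this abstract. The function names and the filler sequence are hypothetical and serve only to illustrate the pattern.

```python
import re

# Y = any amino acid -> "."; X = any amino acid except cysteine -> "[^C]".
# Spacer lengths follow the two formulas exactly as printed in the abstract.
CONSENSUS_I = re.compile(r"C.{25,29}C.{2}C.{25,35}[^C]C.{27,34}C.C")
CONSENSUS_II = re.compile(r"C.{28}C.{3}C.{30,32}[^C]C.{31}C.C")


def matches_monomer_consensus(sequence: str) -> bool:
    """Return True if the sequence contains either consensus pattern."""
    seq = sequence.upper()
    return bool(CONSENSUS_I.search(seq) or CONSENSUS_II.search(seq))


def build_toy(x_residue: str) -> str:
    """Hypothetical filler sequence laid out like the second formula.

    Alanine is used as an arbitrary stand-in for every Y position;
    x_residue is placed at the X position.
    """
    return ("C" + "A" * 28 + "C" + "AAA" + "C" + "A" * 31
            + x_residue + "C" + "A" * 31 + "C" + "A" + "C")


# Serine at X (one of the substitutions named in the abstract) fits the
# pattern; a cysteine at X, as in a dimer-forming protein, does not.
print(matches_monomer_consensus(build_toy("S")))
print(matches_monomer_consensus(build_toy("C")))
```

Because Y may be any residue, including cysteine, the pattern is permissive at the spacer positions. Only the X position distinguishes the monomeric variant in this sketch.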

The protein is a morphogenetic protein belonging to the
TGF-beta, bone morphogenic protein (BMP), growth
differentiation factor (GDF), activin or glial cell
line-derived neurotrophic factor (GDNF) family. In (I) the
cysteine residue is substituted by alanine, serine,
threonine, leucine, isoleucine, glycine or valine, and
(I) contains additional amino acids that facilitate or
mediate the transfer and localization of the protein in
a certain tissue.

Preferred Nucleic Acid: (II) is DNA and further contains
suitable expression control sequences facilitating and/or
driving expression of the encoded protein. The vector
system containing (II) is suitable for prokaryotic
expression.

Preferred Cell: (IV) is a prokaryotic cell, preferably an
embryonal cell.

Preferred Composition: (V) contains (I) and/or (II) in
and/or on a biocompatible, biodegradable matrix
material. The material is itself osteogenically active.
(V) is an injectable composition or is a vehicle such as
liposomes, nanospheres, larger implantable containers or
micropumps that allows the composition to be directed to
and released at a determined site of action.

Preparation: The monomeric protein can be prepared by
standard recombinant methods, in particular by
expression in prokaryotes.

(V) is suitable for surgical local application, topical 
or systemic application (claimed). Dosage not specified. 

SPECIFIC PROTEINS 

(I) is selected from MP52 (GDF5) and MP121 (or their
biologically active parts or variants), comprising a
sequence of 501 and 352 amino acids, respectively,
encoded by a DNA sequence of 2703 and 2272 base pairs,
respectively, defined in the specification (claimed).

None given. 

TITLE-TERMS: NOVEL MONOMERIC PROTEIN TRANSFORM GROWTH 

FACTOR BETA FAMILY PREVENT THERAPEUTIC 
DISEASE ASSOCIATE BONE CARTILAGE DAMAGE 
PROMOTE WOUND HEAL SUBSTITUTE DELETE 
CYSTEINE 

DERWENT-CLASS: B04 D16 P34

CPI-CODES: B04-E02B; B04-E04; B04-E08; B04-F0100E;
B04-H01; B04-N02B; B14-F01; B14-F02;
B14-G02C; B14-J01; B14-J05; B14-N01;
B14-N06; B14-N07; B14-N12; B14-N16;
B14-N17; B14-P01; B14-P02; B14-S03;
D05-C12; D05-H12B2; D05-H12D5; D05-H12E;
D05-H14; D05-H17;
CHEMICAL-CODES: Chemical Indexing M1 *01* Fragmentation
Code M423 M710 N135 P446 P519 P520 P624
P714 P721 P723 P923 P941 P942 P943 Q233
Specific Compounds RA00NS Registry
Numbers 93605

Chemical Indexing M1 *02* Fragmentation
Code M423 M710 N135 P446 P519 P520 P624
P714 P721 P723 P923 P941 P942 P943 Q233
Specific Compounds RA00GT Registry
Numbers 200757 200799

Chemical Indexing M1 *03* Fragmentation
Code M423 M710 P446 P519 P520 P624 P714
P721 P723 P923 P941 P942 P943 Q233
Specific Compounds RA0365 Registry
Numbers 109233

Chemical Indexing M1 *04* Fragmentation
Code M423 M710 P446 P519 P520 P624 P714
P721 P723 P923 P941 P942 P943 Q233
Specific Compounds RA1K5Q Registry
Numbers 274369

Chemical Indexing M1 *05* Fragmentation
Code M423 M710 P446 P519 P520 P624 P714
P721 P723 P923 P941 P942 P943 Q233
Specific Compounds RA03QY Registry
Numbers 86535



SECONDARY-ACC-NO:

CPI Secondary Accession Numbers: 2001-068236 